In recent years, the need for recommender systems has increased to enhance
user engagement, provide personalized services, and increase revenue,
especially in the online shopping industry where vast amounts of customer
data are generated. Collaborative filtering (CF) is the most widely used and
effective approach for generating appropriate recommendations. However,
current CF approaches have limitations in addressing common
recommendation problems such as inaccurate recommendations,
sparsity, scalability, and significant prediction errors. To overcome these
challenges, this study proposes a novel hybrid CF method for movie
recommendations that combines the incremental singular value
decomposition approach with an item-based ontological semantic filtering
approach in two phases, online and offline. The ontology-based technique is
leveraged to enhance the accuracy of predictions and recommendations.
Evaluating our method on a real-world movie recommendation dataset using
precision, F1 scores, and mean absolute error (MAE) demonstrates that our
system generates accurate predictions while addressing sparsity and
scalability issues in recommendation systems. Additionally, our method has
the advantage of reduced running time.

1. INTRODUCTION

A recommendation system is an intelligent system that captures user behavior on internet portals in 
order to forecast user interest in future online product purchases, movie viewing, or music listening. Because 
of the growing popularity of the internet market, several websites now provide numerous choices to their 
customers. Looking for the appropriate item in which the customer is interested among thousands of items in 
a short amount of time has become quite difficult. Thus, a recommendation system has been introduced to 
meet this challenge. A recommendation system may draw on data such as a user's previous purchase
history, search patterns, and online behavior. When a new item or user enters the online website, the
recommender system searches the existing data and recommends the best item based on the customer's
preferences [1].

Based on how they function, recommendation systems are categorized into three types:
content-based recommendation systems, collaborative recommendation systems, and hybrid
recommendation systems. Approaches such as collaborative filtering (CF) and content-based filtering
(CBF) have mainly been developed to acquire insight into user preferences [2], [3]. The suggestions
made by the CBF technique depend on their content similarity to items previously scored by the user,
whereas the CF technique takes advantage of the similarity of users' tastes for suggestions [4], [5].
The CF technique is classified into two types: user-based and item-based. The similarity between users
in the user-based CF method is estimated according to co-rated items. In contrast, item-based CF
assesses similarity among items instead of users, on the premise that users will be interested in items
similar to those they have previously liked. The hybrid strategy combines the two filtering procedures
and achieves superior prediction accuracy compared with either technique alone [6].

Even though CF has drawn attention owing to its effectiveness and simplicity [4], it still faces the 
following challenges: data sparsity [7], computation time, accuracy of recommendations, scalability, and data 
volume. To address the issues presented by the CF method, the hybrid recommendation strategy employs 
several information filtering techniques. The hybrid filtering approach is designed to produce more efficient 
and accurate recommendations than a single technique. In addition, the hybrid model overcomes the 
disadvantages of a single system by combining many techniques. In this research, we propose a
recommendation approach that uses both ontological semantic filtering and an incremental algorithm to
provide high scalability when dealing with massive increases in user and item matrix sizes as well as
sparsity issues, with the aim of achieving accurate predictions with a reduced running time of the
recommendation system. The rest of the paper is organized as follows: section 2 reviews related work,
section 3 describes the proposed approach, section 4 discusses the practical implementation and
evaluation of the proposed approach, and section 5 concludes the paper.


2. RELATED WORK 

Several strategies for recommendation systems have been established in prior research. Nowadays, 
recommender systems are essential to speed up internet users’ searches for relevant content. In the area of 
recommender systems, using ontologies as a knowledge base is becoming increasingly popular in modeling tasks, 
inferring new knowledge [8], [9] or computing similarity for recommender systems. Adopting ontologies in 
information systems intends to model information at the semantic level by structuring and organizing a set of 
hierarchical terms or concepts within a domain and modeling the relationships between these sets of terms or 
concepts using a relational descriptor [10], [11]. Recommender systems based on knowledge represented by
ontologies explicitly solicit user requirements for these elements and rely on an in-depth understanding of
the underlying domain for similarity measures and prediction computation. Among the significant number of
published studies, the authors of [11] enhance user profile representation by implementing an
ontology-based recommendation system. By introducing domain ontologies into the system, the suggested
technique is able to uncover relationships between users and their favorite choices regarding items. The
authors conducted several experiments based on offline tests and compared the new recommendation approach
to collaborative approaches. To improve the quality of the recommendations, Hassan et al. [12] used item
semantic knowledge. They created a hybrid semantic enhanced recommendation strategy that combines
inferential ontology-based semantic similarity (OBSS) with the classic item-based CF method. Kermany and
Alizadeh [13] proposed a multi-criteria recommender system using an adaptive neuro-fuzzy inference system
(ANFIS) that relies on ontological item-based and user demographic information. Their method was tested
using the Yahoo movies platform dataset. According to their results, the accuracy of a multi-criteria
recommendation system can be increased by incorporating semantic information.

Dimensionality reduction techniques have been widely used in the literature on recommendation
systems. Among the most successful are singular value decomposition (SVD) and its variants, and
principal component analysis (PCA) [14], [15]. These techniques reduce the dimensionality of the data,
which helps in handling the sparsity of the data and improving the efficiency and accuracy of the
recommendation process. The literature shows that these techniques have been successful in improving
the recommendation performance on various datasets, especially after the challenge launched by Netflix.
Indeed, many works analyzing the results of the challenge, such as [16], demonstrated that approaches
applying dimensionality reduction techniques are more accurate than plain CF algorithms. Recent
research related to our work has also used SVD as a technique in CF for recommendation systems. For
instance, Wang et al. [17] proposed a CF algorithm that incorporates trust between users to improve
recommendation accuracy. The algorithm combines the traditional SVD method with a trust factor matrix,
and the results show that it outperforms other state-of-the-art CF methods in terms of recommendation
accuracy. Nilashi et al. [18] combine CF with ontology-based techniques and dimensionality reduction
to improve the accuracy and coverage of CF; their system combines semantic similarity and matrix
factorization to handle the sparsity problem and provide more personalized recommendations.
Incremental SVD, in turn, has been proposed as a way to improve scalability and performance compared to
non-incremental SVD [19]. Brand [20] uses an incremental SVD approach with incomplete data to solve the
issue of uncertain new data with missing values and/or affected by correlated noise. In comparison to
the standard SVD technique, incremental SVD updates the factorization model using only new information
instead of recomputing the entire model from scratch, which can be computationally expensive and
time-consuming for large datasets. As a result, incremental SVD can reduce the training time and
improve the efficiency of the recommendation system without sacrificing accuracy. Overall, our
contribution lies in presenting a comprehensive recommendation method that combines dimensionality
reduction, ontology-based techniques, and incremental SVD to address key challenges in recommendation
systems. By leveraging these techniques, we aim to improve recommendation accuracy, scalability, and
efficiency, which will ultimately enhance user experience and satisfaction.


3. HYBRID RECOMMENDER SYSTEM PROPOSITION 

Figure 1 shows the diagram illustrating how the proposed recommendation system works. The 
suggested recommendation system aims to provide efficient, scalable, and accurate recommendations. The
process consists of two phases. In the first phase, several tasks are performed during the construction
of the recommendation model, such as clustering of items and users based on ratings, dimensionality
reduction using the SVD algorithm, and constructing item-user similarity matrices. First, the system is
supplied with a user-item matrix that specifies the rating given by each user to each item. Item
clusters are then constructed using fuzzy c-means clustering, which requires the similarity between
items: the pairwise similarity between items is computed so that items can be grouped by similarity.
The overall similarity is obtained as a weighted average of the item-based (rating-based) and
ontology-based similarities.

Figure 1. Proposed system framework using ontology and incremental SVD


A new ontology-based algorithm is proposed to compute the item similarity. Following that, we
create decomposition matrices using SVD for the user-item cluster matrix. It is worth mentioning that we
build the SVD model for both items and users, so that in each case the similarity computation is
performed after the matrix decomposition. After the clusters of similar items have been produced,
ratings are predicted for the items that the current user has not yet rated, in order to eliminate
sparsity in the user-item matrix. The incremental SVD is employed in the second phase of the
recommendation process (online phase) to carry out the prediction and recommendation tasks for targeted
users and items. We follow the same procedure for the user-based suggestion as for the item-based
suggestion. Finally, the user-based and item-based predictions are integrated. In the following
subsections, the approach is discussed in depth.


3.1. Preprocessing of data 

The initial step in our research is to preprocess the dataset in order to make it suitable for the
proposed method. This involves the preparation steps that real-life data typically require before
analysis. In our approach, we begin by transforming movie ratings into a user-item matrix, often
referred to as a rating utility matrix. This matrix captures the ratings provided by users for different
movies, as shown in Figure 2. However, this matrix is typically sparse, meaning many cells are empty
because they represent movies the user has not rated. CF algorithms typically work with dense matrices,
so we need to convert the sparse matrix into a dense matrix by applying normalization techniques. The
empty cells in the matrix correspond to new users, new movies, or movies not rated by anyone. Ratings
of 4 or 5 are taken to indicate positive sentiment (user preference) towards a movie, while ratings of
1 or 2 indicate negative sentiment (user disinterest). To address item and user bias in the ratings, we
normalize the ratings using mean normalization.

Figure 2. Transform original data to user-item rating matrix
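
As an illustration of the mean normalization step, the following is a minimal sketch, assuming the user-item matrix is held as a NumPy array in which `np.nan` marks an unrated cell. The function name `mean_normalize` and the choice to center on each user's mean only (rather than also on item means) are assumptions made for this example, not details given in the text.

```python
import numpy as np

def mean_normalize(ratings):
    """Subtract each user's mean rating from that user's observed ratings.

    `ratings` is a users x items array where np.nan marks an unrated cell.
    Returns the centered matrix (unrated cells set to 0) and the user means.
    """
    ratings = np.asarray(ratings, dtype=float)
    observed = ~np.isnan(ratings)
    counts = observed.sum(axis=1)
    sums = np.where(observed, ratings, 0.0).sum(axis=1)
    # Users without any rating get a mean of 0 to avoid dividing by zero.
    user_means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    centered = np.where(observed, ratings - user_means[:, None], 0.0)
    return centered, user_means

# Hypothetical toy matrix: 2 users, 3 movies.
R = np.array([[5.0, 3.0, np.nan],
              [np.nan, 2.0, 4.0]])
centered, means = mean_normalize(R)
```

After centering, a positive entry means the user rated the movie above their own average, which removes the effect of users who rate systematically high or low.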


3.2. Movie ontology 

In this research, we use the movie ontology (MO), which is based on the web ontology language
(OWL) standard and was created at the Department of Informatics of the University of Zurich [21]. MO
describes the semantic concepts related to the movie domain. The class "movie" is the main class, and
all movies are considered instances of it. Many studies have demonstrated that using an ontology-based
semantic approach improves the prediction accuracy of recommendation systems [22], [23].


3.3. User-based clustering 

In user clustering, users are grouped based on similar preferences, as determined by their ratings.
After clustering the users, the aggregated opinions of each cluster are used to predict unknown ratings
for target users or to predict which items they like or dislike. Since each cluster contains a
restricted number of users, there is no need to evaluate all users, which improves performance.


3.4. Item-based clustering 
3.4.1. Compute ontology-based item similarity 

Ontologies supply rich knowledge on a given topic, which can be highly valuable in a
recommendation system [24]. Most studies ignored the multilevel and complex structure of ontologies and
used just one feature to determine item similarity according to the ontology. For instance, several
researchers have relied only on a movie's "genre" to identify a related collection of films based on
ontology. In the context of a movie recommendation system, consider Figure 3. In this example, Cl
represents the movie class. Within this movie class, we have two attributes, At1 and At2, which could
represent characteristics such as the release date and copyright information. Additionally, we
introduce a subclass called SCl, which represents the "movie origin". This subclass includes attributes
At3, At4, and At5, which correspond to specific regions such as North Africa, Asia, and Europe. By
organizing the movie data in this hierarchical manner, we can capture more detailed information about
movies and their origins. This ontology-based approach allows us to categorize and represent movies
based on their attributes, enabling more sophisticated recommendation algorithms to provide
personalized movie suggestions to users. This work uses the binary Jaccard similarity coefficient to
compute item-based semantic similarity. For two items to be similar, their attributes and the
attributes of their subclasses must be similar [25]. The similarity between items is computed
recursively as the average of the attribute similarities, descending through subclasses until the
maximum depth defined at the beginning is reached. Equations (1) to (3) are used to determine the
ontology-based similarity:


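$$\mathrm{Sim}(Cl_i.At, Cl_j.At) = \frac{|Cl_i.At \cap Cl_j.At|}{|Cl_i.At \cup Cl_j.At|} \tag{1}$$

$$\mathrm{SIM}_{ontology}(It_i, It_j) = \frac{\sum_{z=1}^{n} \mathrm{Sim}(Cl_i.At(z), Cl_j.At(z))}{n} \tag{2}$$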


A highly scalable CF recommendation system using ontology and SVD-based ... (Sajida Mhammedi) 


ISSN: 2302-9285


$$\mathrm{SIM}_{ontology}(It_i, It_j) = \frac{\sum_{z=1}^{n} \mathrm{Sim}(Cl_i.At(z), Cl_j.At(z)) + \frac{1}{m}\sum_{d=1}^{m} \mathrm{Sim}(SCl_i.At(d), SCl_j.At(d))}{n} \tag{3}$$


Equation (1) gives the semantic similarity between classes $Cl_i$ and $Cl_j$ of two items $It_i$ and
$It_j$ for a specific attribute At. Assuming no attribute in the ontology is a subclass, (2) gives the
ontology-based similarity between two items $It_i$ and $It_j$, where n is the total number of
attributes. Equation (3) is used when a subclass with its own attributes exists in the ontology; m
denotes the number of attributes of the subclass. Determining the ontology-based similarity between two
items of a common domain requires the following two inputs: i) the ontology of items with classes,
properties, and relationships; and ii) the set I of all items. The output is the semantic similarity
matrix (SSM), which measures the semantic similarity between each pair of items based on the ontology.


Figure 3. Example of item's ontology
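
The recursive averaging described in this subsection can be sketched as follows. This is a minimal illustration, assuming each item is represented as a nested dictionary with an `"attributes"` mapping (attribute name to a set of values) and an optional `"subclass"` entry of the same shape; this representation, the function names, and the sample movies are hypothetical. The sketch also assumes that a subclass occupies one slot among the n averaged terms, which keeps the score in [0, 1]; the text does not state this explicitly.

```python
def jaccard(a, b):
    """Binary Jaccard coefficient between two collections of values."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ontology_similarity(item_i, item_j, max_depth=2):
    """Average attribute similarity, recursing into subclasses up to max_depth."""
    attrs_i = item_i.get("attributes", {})
    attrs_j = item_j.get("attributes", {})
    names = set(attrs_i) | set(attrs_j)
    total = sum(jaccard(attrs_i.get(k, ()), attrs_j.get(k, ())) for k in names)
    slots = len(names)
    sub_i, sub_j = item_i.get("subclass"), item_j.get("subclass")
    if max_depth > 0 and sub_i is not None and sub_j is not None:
        # The subclass contributes its own averaged similarity as one term.
        total += ontology_similarity(sub_i, sub_j, max_depth - 1)
        slots += 1
    return total / slots if slots else 0.0

# Hypothetical movies described by two attributes and an "origin" subclass.
movie_a = {"attributes": {"genre": {"drama", "crime"}, "year": {"1994"}},
           "subclass": {"attributes": {"region": {"Europe"}}}}
movie_b = {"attributes": {"genre": {"drama"}, "year": {"1994"}},
           "subclass": {"attributes": {"region": {"Europe"}}}}
sim_ab = ontology_similarity(movie_a, movie_b)
```

Here the genre attribute contributes 0.5, the year contributes 1, and the subclass contributes 1, so the averaged score is 2.5/3.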


3.4.2. Calculation of item similarity using explicit user ratings 

The similarity between items is determined based on explicit ratings supplied by users in the
user-item rating matrix, where I represents the set of items, U represents the set of users, and
$R_{ui}$ represents the score given by user u to item i, as seen in Figure 4. The similarity metric
used is given in (4):


$$\mathrm{Sim}(It_i, It_j) = \sqrt{\frac{\sum_{u=1}^{l} (It_{i,u} - It_{j,u})^2}{l}} \tag{4}$$


where $It_{i,u}$ and $It_{j,u}$ represent the ratings given by user u to item $It_i$ and item $It_j$,
respectively, and $1 \le u \le l$, with l the total number of users who rated both items.

Figure 4. User-item rating matrix


3.4.3. Total item similarity score 
The total similarity between items is obtained as a weighted combination of the similarity score
supplied by the ontology in (2) and (3) and the similarity score derived from explicit user ratings in (4):

$$\mathrm{TotalSim}(It_i, It_j) = \alpha \cdot \mathrm{SIM}_{ontology}(It_i, It_j) + \mu \cdot \mathrm{Sim}(It_i, It_j) \tag{5}$$

where $\alpha + \mu = 1$. The total item similarity matrix (TISM) is generated after calculating the
overall similarity for each item in the item set using (5).
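
A minimal sketch of how the TISM could be assembled is given below, assuming the ontology-based scores and the rating-based scores are already available as two square NumPy arrays of the same shape. The function name, the variable names, and the default weight of 0.5 are hypothetical; the text only requires that the two weights sum to 1.

```python
import numpy as np

def total_item_similarity(ssm, rating_sim, alpha=0.5):
    """Combine ontology-based and rating-based item similarity matrices.

    The two weights are alpha and mu = 1 - alpha, so they always sum to 1.
    """
    ssm = np.asarray(ssm, dtype=float)
    rating_sim = np.asarray(rating_sim, dtype=float)
    if ssm.shape != rating_sim.shape:
        raise ValueError("both similarity matrices must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    mu = 1.0 - alpha
    return alpha * ssm + mu * rating_sim

# Hypothetical scores for two items.
ssm = np.array([[1.0, 0.8], [0.8, 1.0]])
rating_sim = np.array([[1.0, 0.4], [0.4, 1.0]])
tism = total_item_similarity(ssm, rating_sim, alpha=0.5)
```

Choosing a larger alpha gives more weight to semantic knowledge, which may help for items with few ratings; the appropriate value would have to be tuned on the data.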


3.4.4. Method of item clustering 

Fuzzy c-means clustering [26] was employed in this study to group similar items since it works well
with the sparse datasets found in most recommendation systems. This study combines content-based
characteristics derived from ontologies with user rating data to avoid the overgeneralization, poor
accuracy, and cluster overlapping that would result from using just one of them. As detailed in the
next section, similar items within a cluster are used to predict the target item's score. As a result,
the number of items that need to be evaluated is significantly smaller than the total number of items
in the system, which increases the system's performance [22]. Once the clusters have been constructed,
a user-item cluster matrix (UICM) is produced, in which U represents the set of users, C represents the
centers of all item clusters, and $a_{mz}$ represents the average rating given by user m to the items
of the cluster with center z, as illustrated in Figure 5.

Figure 5. Construction of UICM from user-item rating matrix
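
The construction of the UICM can be sketched as follows. This is an illustrative example only: it assumes each item has already been assigned to a single cluster (for instance by taking the cluster with the highest fuzzy membership, which is an assumption of this sketch), and the names `build_uicm` and `item_labels` are hypothetical.

```python
import numpy as np

def build_uicm(ratings, item_labels, n_clusters):
    """Average each user's observed ratings over the items of every cluster.

    `ratings` is users x items with np.nan for unrated cells; `item_labels`
    holds one cluster index per item. Cells stay np.nan when a user has
    rated no item of that cluster.
    """
    ratings = np.asarray(ratings, dtype=float)
    item_labels = np.asarray(item_labels)
    uicm = np.full((ratings.shape[0], n_clusters), np.nan)
    for z in range(n_clusters):
        block = ratings[:, item_labels == z]
        rated = ~np.isnan(block)
        counts = rated.sum(axis=1)
        sums = np.where(rated, block, 0.0).sum(axis=1)
        has_rating = counts > 0
        uicm[has_rating, z] = sums[has_rating] / counts[has_rating]
    return uicm

# Hypothetical example: 2 users, 4 items grouped into 2 clusters.
R = np.array([[5.0, 3.0, np.nan, 1.0],
              [np.nan, 4.0, 2.0, np.nan]])
uicm = build_uicm(R, item_labels=[0, 0, 1, 1], n_clusters=2)
```

The resulting matrix has one column per cluster instead of one per item, which is what reduces the amount of data examined in later steps.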


3.4.5. Prediction for the rating 

Based on the cluster generated, a sorted list of top T similar items is produced for a target item. 
Using the obtained values, the empty cells in the user-item rating matrix for the target user are then filled. 
The rating for each unrated (target) item is anticipated based on the active user's ratings for items comparable 
to that unrated (target) item. Using (6), we predict the rating that a target user u is expected to give
to an unrated item i.


$$\mathrm{Pred}_{u,i} = \frac{\sum_{j \in T} \mathrm{Similarity}(i,j) \cdot rat_{u,j}}{\sum_{j \in T} \mathrm{Similarity}(i,j)} \tag{6}$$


where Similarity(i, j) is the similarity score between target item i and item j, $rat_{u,j}$ is the rating
for similar item j by user u, and T is the set of the top similar items considered. In certain cases, the
current user may not have rated the top T similar items for a target item, leaving some empty cells in the
user-item matrix after filling it. To address this issue, an extended technique can be used to estimate the
remaining sparse cells. In this method, the active user's rating behavior for other items is taken into
account, as well as other users' ratings for the unrated (target) item. Using the suggested method, the
rating of an unrated item i by target user u may be estimated as (7):


$$\mathrm{Est}_{u,i} = \alpha \cdot \frac{\sum_{m=1}^{M} r_{u,m}}{M} + \mu \cdot \frac{\sum_{p=1,\, p \ne u}^{n} r_{p,i}}{n} \tag{7}$$


where $\alpha$ and $\mu$ are control parameters, M is the number of other items rated by the target user u,
$1 \le m \le M$, $r_{u,m}$ is the rating given by u to another item m, n is the number of other users who
submitted a rating for the unrated target item i, $1 \le p \le n$ with $p \ne u$, and $r_{p,i}$ is the
rating given to target item i by another user p, excluding target user u. The result of predicting the
unrated values in an explicit user-item rating matrix UIM(U, I) is a dense, non-sparse user-item rating
matrix DUIM(U, I).
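
The two prediction rules of this subsection can be illustrated with the sketch below. It is a simplified example under stated assumptions: ratings are stored in NumPy arrays with `np.nan` for unrated cells, the function names are hypothetical, and the default values of `top_t`, `alpha`, and `mu` are placeholders rather than the settings used in the experiments.

```python
import numpy as np

def predict_from_similar_items(user_ratings, sim_row, target, top_t=3):
    """Similarity-weighted average over the top-T neighbours the user has rated."""
    user_ratings = np.asarray(user_ratings, dtype=float)
    sims = np.array(sim_row, dtype=float)
    sims[target] = -np.inf          # never use the target item as its own neighbour
    top = np.argsort(-sims)[:top_t]
    top = top[np.isfinite(sims[top])]
    rated = top[~np.isnan(user_ratings[top])]
    if rated.size == 0 or sims[rated].sum() == 0:
        return np.nan               # cell stays empty; the fallback rule can be tried
    return float(np.dot(sims[rated], user_ratings[rated]) / sims[rated].sum())

def fallback_estimate(user_ratings, item_ratings_by_others, alpha=0.5, mu=0.5):
    """Blend the user's own mean rating with other users' mean rating of the item."""
    own = np.asarray(user_ratings, dtype=float)
    others = np.asarray(item_ratings_by_others, dtype=float)
    own, others = own[~np.isnan(own)], others[~np.isnan(others)]
    if own.size == 0 or others.size == 0:
        return np.nan
    return float(alpha * own.mean() + mu * others.mean())

# Hypothetical data: item 0 is the unrated target item.
user = [np.nan, 4.0, 2.0, 5.0]
sim_to_target = [1.0, 0.9, 0.5, 0.1]
pred = predict_from_similar_items(user, sim_to_target, target=0, top_t=2)
est = fallback_estimate(user, [3.0, 5.0])
```

With `top_t=2`, the prediction uses items 1 and 2 only, giving (0.9 × 4 + 0.5 × 2) / 1.4; the fallback blends the user's mean of 11/3 with the other users' mean of 4.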


3.5. Dimensionality reduction
3.5.1. Singular value decomposition 

According to Zhou et al. [19], one of the standard solutions for sparsity issues is to use data dimension
reduction techniques, notably SVD, which is a matrix factorization technique that can extract dataset
features by factorizing the original user-item rating matrix into the product of three smaller matrices.
Given an m×n matrix $A \in \mathbb{R}^{m \times n}$ (n is the number of items and m is the number of users)
with rank(A) = r, the SVD of A is expressed as $SVD(A) = U \times \Sigma \times V^T$, where
$U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{n \times r}$, and $\Sigma \in \mathbb{R}^{r \times r}$.
The middle matrix $\Sigma$ is a diagonal matrix with r nonzero entries, which are the singular values of A.
Keeping only the k largest singular values yields the best rank-k linear approximation of the utility
matrix A.
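
A rank-k approximation of a small dense matrix can be computed with NumPy as sketched below. The toy matrix, the value k = 2, and the helper name `truncated_svd` are chosen for illustration only.

```python
import numpy as np

def truncated_svd(a, k):
    """Return the k leading singular triplets (U_k, s_k, V_k^T) of a."""
    u, s, vt = np.linalg.svd(np.asarray(a, dtype=float), full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]

# Hypothetical dense rating matrix: 4 users x 3 items.
A = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 1.0],
              [1.0, 1.0, 5.0],
              [2.0, 1.0, 4.0]])
u_k, s_k, vt_k = truncated_svd(A, k=2)
A_k = u_k @ np.diag(s_k) @ vt_k   # best rank-2 approximation of A
```

The Frobenius norm of `A - A_k` equals the square root of the sum of the squared discarded singular values, which is a convenient check on how much information the reduction loses.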


3.5.2. Incremental singular value decomposition algorithm in the prediction task 

The algorithms in the proposed study operate in two stages, offline and online. In the suggested CF
recommendation system, user-to-user mapping takes place offline, whereas the actual rating prediction for
target user interactions is made online. The offline stage is time-consuming, whereas the online stage is
efficient in terms of prediction and recommendation time owing to the use of the incremental SVD. The
parallel design for the similarity formation method can be made highly scalable using SVD dimensionality
reduction while still producing good results in most cases. This study presents incremental SVD algorithms
that produce recommendations online for target users in the shortest time possible. The most essential
quality of the incremental algorithm is that it supports a high number of users, making the system
scalable as the size of the user-item matrix grows.

Our recommender system operates in two distinct phases. First, the model is built offline by
calculating user-user or item-item similarity. Then, when a new user or item is introduced, the online
process begins and the model generates predictions. In incremental SVD, the projection method is known as
folding-in: new users are folded into the space of the previously reduced user-item matrix. For instance,
Figure 6 shows that after running the SVD method on A1 in the offline process, yielding three matrices U1,
Σ1, and V1, the online process applies the incremental approach whenever a new matrix A2 is added,
resulting in three updated matrices U2, Σ2, and V2.


Figure 6. Phases of recommendation process (offline process: SVD algorithm applied to A1; online process:
incremental algorithm applied when a new matrix A2 arrives)
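
The folding-in projection can be sketched as follows, using the classical formula in which a new user's rating row is projected onto the existing item factors and scaled by the inverse singular values. This is an illustration of the standard projection, not necessarily the exact update used in the proposed system; the toy matrices and function names are hypothetical.

```python
import numpy as np

def fold_in_user(new_ratings, s_k, vt_k):
    """Project a new user's rating row into the existing k-dimensional space."""
    return (np.asarray(new_ratings, dtype=float) @ vt_k.T) / s_k

def predict_for_folded_user(u_new, s_k, vt_k):
    """Reconstruct the new user's predicted ratings from the folded-in vector."""
    return (u_new * s_k) @ vt_k

# Offline phase: factorize the existing matrix A1 once (hypothetical data).
A1 = np.array([[5.0, 4.0, 1.0],
               [4.0, 5.0, 1.0],
               [1.0, 1.0, 5.0],
               [2.0, 1.0, 4.0]])
k = 2
u, s, vt = np.linalg.svd(A1, full_matrices=False)
u_k, s_k, vt_k = u[:, :k], s[:k], vt[:k, :]

# Online phase: a new user arrives; no full recomputation of the SVD is needed.
new_user = np.array([5.0, 5.0, 1.0])
u_new = fold_in_user(new_user, s_k, vt_k)
predicted = predict_for_folded_user(u_new, s_k, vt_k)
```

Folding in a row that already belongs to A1 returns the corresponding row of U_k, which is a simple way to check the projection. Because the existing factors are left unchanged, the approximation may degrade as many new users are folded in, so the model is usually rebuilt offline from time to time.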


4. EXPERIMENTATIONS AND RESULTS 
4.1. Dataset description 

The MovieLens dataset, available at [27], is one of the most well-known datasets for evaluating
recommender systems. The MovieLens 1M dataset consists of about 1 million ratings provided by 6,040 users
for a total of 3,900 movies. Each rating is expressed on a scale of 1 to 5, where a rating of 1 indicates
the least liked movie and a rating of 5 represents the most liked movie. The dataset offers a
comprehensive collection of user ratings, allowing us to evaluate and enhance our recommendation system
based on a broad range of user preferences. Detailed information about the dataset is presented in
Table 1.


Table 1. Description of the dataset
                      MovieLens 1M
Users                 6,040
Items (movies)        3,900
Ratings               1,000,209
Range of ratings      1-5
Sparsity              95.75%
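
The sparsity value in Table 1 can be checked from the other entries, taking sparsity as the fraction of empty cells in the 6,040 × 3,900 user-item matrix:

$$\text{sparsity} = 1 - \frac{1{,}000{,}209}{6{,}040 \times 3{,}900} = 1 - \frac{1{,}000{,}209}{23{,}556{,}000} \approx 1 - 0.0425 = 0.9575$$

That is, only about 4.25% of the possible user-movie pairs carry a rating.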


Bulletin of Electr Eng & Inf, Vol. 12, No. 6, December 2023: 3768-3779 




WebSPHINX [28], a web crawler, was used in this study to collect information about the items from
IMDb [29]. The gathered data is used to construct and complete the item ontology. To conduct the tests,
the dataset was randomly divided into a training set containing 80% of the data and a testing set
containing the remaining 20%.
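
A random 80/20 split of the rating records can be sketched as below. The function name, the fixed seed, and the use of index arrays are assumptions made for reproducibility of the example; the text does not specify how the random selection was implemented.

```python
import numpy as np

def split_ratings(n_ratings, test_fraction=0.2, seed=0):
    """Return shuffled (train, test) index arrays over the rating records."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_ratings)
    n_test = int(round(n_ratings * test_fraction))
    return order[n_test:], order[:n_test]

# 1,000,209 is the number of ratings reported for MovieLens 1M.
train_idx, test_idx = split_ratings(1_000_209)
```

The index arrays can then be used to select rows from the list of (user, movie, rating) records.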


4.2. Evaluation and discussion of the proposed system 

The recommender system presented in this study was implemented using Python 3.9.7 on a PC with 
a 4 GHz processor, 8 GB RAM, and 64-bit Microsoft Windows 10. To thoroughly assess the system's 
performance, it was compared to various related approaches, including Pearson nearest neighbor algorithm, 
item-based CF with EM, SVD combined with ontology, and user-item-based EM and SVD with and without 
ontology integration. The evaluation was conducted from two perspectives: time throughput 
(recommendations per second) and accuracy, providing valuable insights into the system's efficiency and 
effectiveness compared to existing approaches. 


4.2.1. Evaluation 1: predictive accuracy analysis 

Mean absolute error (MAE) is a statistical accuracy metric used to evaluate prediction accuracy. In 
this experiment, the MAE computes the difference between the predicted and actual ratings. MAE is 
presented in (8): 


$$\mathrm{MAE}(pred, act) = \frac{\sum_{k=1}^{Nb} \left| pred_{u,k} - act_{u,k} \right|}{Nb} \tag{8}$$


where Nb is the number of items rated by user u. The prediction accuracy of the suggested approach is
assessed using MAE and compared to the state-of-the-art methods. Figure 7 displays the results on the
MovieLens dataset against different neighborhood sizes.


Figure 7. MAE using MovieLens datasets for all methods
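
For reference, the MAE over a set of held-out ratings can be computed as in the following sketch. The function name and the sample values are illustrative only and are not taken from the experiments.

```python
import numpy as np

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape or predicted.size == 0:
        raise ValueError("inputs must be non-empty and have the same shape")
    return float(np.mean(np.abs(predicted - actual)))

# Hypothetical predictions for three test ratings of one user.
mae = mean_absolute_error([4.2, 3.1, 1.5], [4.0, 3.0, 2.0])
```

A lower MAE means the predicted ratings are closer to the ratings the users actually gave.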


4.2.2. Evaluation 2: decision-support accuracy 

In terms of accuracy measurements, the decision-support metrics will be crucial in evaluating the overall 
performance of the hybrid-based recommender. In the information retrieval area, several measures for this aim are 
well-known. Recall, precision, and F-measure are among the metrics included in this category. The precision 
computes the fraction of relevant items in the list of returned results. In contrast, the recall calculates the fraction of 
pertinent items that have been retrieved. The two metrics should be used together, since recall increases as the
number of items retrieved increases, whereas precision often decreases as the result size increases.


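$$\mathrm{precision} = \frac{TR}{TR + FP} \tag{9}$$

$$\mathrm{recall} = \frac{TR}{TR + FN} \tag{10}$$

where TR is the number of relevant items that are recommended, FP is the number of recommended items that
are not relevant, and FN is the number of relevant items that are not recommended. The F measure is a
metric that takes both values into account, as indicated in (11):

$$F1 = \frac{(1 + \alpha^2) \cdot \mathrm{precision} \cdot \mathrm{recall}}{\alpha^2 \cdot \mathrm{precision} + \mathrm{recall}} \tag{11}$$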
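
The decision-support metrics can be computed for a single top-N list as sketched below. The function name and the sample item identifiers are hypothetical; with the weighting parameter set to 1, the F measure reduces to the usual harmonic mean of precision and recall.

```python
def precision_recall_f(recommended, relevant, alpha=1.0):
    """Precision, recall and F measure for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)   # relevant items that were recommended
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = alpha ** 2 * precision + recall
    f = (1 + alpha ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f

# Hypothetical top-5 list and set of items the user actually liked.
p, r, f1 = precision_recall_f([1, 2, 3, 4, 5], {2, 3, 7, 8})
```

In this example two of the five recommended items are relevant, so precision is 0.4, and two of the four relevant items were retrieved, so recall is 0.5.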


The F1 measures and precision values for all methods on various top-N recommendations are shown
in Table 2. The table shows that the precision achieved by the suggested technique is higher than that of
every other method at every top-N setting, with the largest gap relative to the Pearson nearest neighbor
algorithm (Method D). In addition, the F1 measures of the proposed method, which combines dimensionality
reduction using incremental SVD with ontology, are also the highest in every setting. These findings
support our claim that the proposed recommendation system is more accurate than the compared methods.


Table 2. Comparison of F1 metric and the precision values for different methods 


          Proposed system    Method A           Method B           Method C           Method D
Top N     F1     Precision   F1     Precision   F1     Precision   F1     Precision   F1     Precision
Top-5     0.811  0.801       0.797  0.787       0.773  0.771       0.721  0.719       0.583  0.564
Top-10    0.816  0.805       0.807  0.796       0.784  0.782       0.739  0.736       0.601  0.582
Top-15    0.830  0.820       0.827  0.816       0.804  0.802       0.749  0.747       0.615  0.592
Top-20    0.843  0.832       0.833  0.821       0.811  0.809       0.760  0.757       0.622  0.601
Top-25    0.856  0.845       0.844  0.833       0.825  0.823       0.770  0.769       0.650  0.628
Top-30    0.845  0.837       0.840  0.831       0.821  0.819       0.762  0.757       0.605  0.603
Top-35    0.837  0.830       0.832  0.823       0.808  0.806       0.751  0.750       0.590  0.581
Top-40    0.839  0.828       0.830  0.819       0.803  0.801       0.741  0.739       0.579  0.573
Top-45    0.833  0.819       0.827  0.818       0.795  0.793       0.732  0.733       0.558  0.556
Top-50    0.831  0.817       0.822  0.813       0.785  0.783       0.722  0.723       0.546  0.541
Method A: user- and item-based CF + SVD + EM + ontology
Method B: item-based CF + SVD + EM + ontology
Method C: user- and item-based CF + SVD + EM
Method D: Pearson nearest neighbor


4.2.3. Evaluation 3: scalability analysis 

The efficiency of the suggested approach is evaluated in this experiment. Evaluation is based on
throughput, defined as the number of recommendations generated per second. We test our strategy on the
MovieLens dataset to demonstrate its effectiveness in addressing the scalability problem. Figure 8
illustrates the performance of our method compared to the state-of-the-art methods.

According to the graph, the throughput of the methods that use dimensionality reduction techniques
and clustering is considerably higher than that of the other methods. Moreover, the throughput of the
proposed approach, based on clustering with expectation maximization (EM), ontology similarity, and
incremental SVD, is slightly higher than that of the other methods, especially those that rely on the
standard SVD reduction technique. Unlike systems that use the nearest neighbor technique, clustering
allows the recommendation system to analyze just a part of the items/users. By contrast, the number of
clusters does not affect the throughput of the nearest neighbor technique, since it must scan all nearest
neighbors.


[Plot: throughput (recommendations per second) versus number of clusters, (a) MovieLens]

Figure 8. Throughput of all methods for MovieLens datasets

The results of the evaluations demonstrate the effectiveness of the proposed recommendation
system. By incorporating ontology and dimensionality reduction techniques in CF, the system achieves
improved predictive accuracy, decision-support accuracy, and scalability compared to the existing
methods tested. The results indicate that considering semantic relationships and reducing dimensionality
enhances the system's ability to capture user preferences and predict ratings accurately, and enables it
to handle large-scale datasets more effectively. The proposed method therefore not only provides accurate
recommendations but also ensures that relevant items are retrieved. The system addresses the limitations
of traditional CF approaches by providing accurate recommendations, assisting users in decision-making,
and efficiently handling large datasets. These findings highlight the proposed system's potential for
practical applications in the recommendation domain.


5. CONCLUSION 

In this paper, we have presented a novel recommendation method that addresses the challenges of 
accuracy, scalability, and sparsity in CF-based recommender systems. Our approach incorporates 
dimensionality reduction using the incremental SVD algorithm, ontological item-based semantic similarity, 
and explicit user ratings to improve the prediction accuracy and scalability of the system. By adopting the 
incremental SVD method, we were able to handle the increasing size of the user-item matrix while 
maintaining computational efficiency. The folding-in technique employed in the incremental SVD algorithm 
significantly reduced the computation cost and allowed our system to achieve high scalability. The
experiments conducted on a real-world movie recommendation dataset confirmed the effectiveness of
our proposed method. The precision, F1 measure, and MAE results demonstrated that our system provides
accurate predictions while effectively addressing the sparsity and scalability issues commonly encountered in
recommender systems. The incorporation of MO further improved the predictive accuracy and expanded the
potential for applying our method to different semantic contexts and domains. Further research can explore
additional evaluation metrics, investigate the system's performance with different datasets, and consider the 
impact of incorporating other factors such as user demographics or temporal dynamics. By continuously 
refining and enhancing the recommendation system, we can further improve the accuracy, relevance, and 
usability of the recommendations provided to users in various domains. 